Multi-Player Bandits - a Musical Chairs Approach

Authors

  • Jonathan Rosenski
  • Ohad Shamir
  • Liran Szlak
Abstract

We consider a variant of the stochastic multi-armed bandit problem, where multiple players simultaneously choose from the same set of arms and may collide, receiving no reward. This setting has been motivated by problems arising in cognitive radio networks, and is especially challenging under the realistic assumption that communication between players is limited. We provide a communication-free algorithm (Musical Chairs) which attains constant regret with high probability, as well as a sublinear-regret, communication-free algorithm (Dynamic Musical Chairs) for the more difficult setting of players dynamically entering and leaving throughout the game. Moreover, neither algorithm requires prior knowledge of the number of players. To the best of our knowledge, these are the first communication-free algorithms with these types of formal guarantees. We also rigorously compare our algorithms to previous works, and complement our theoretical findings with experiments.
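
The abstract does not spell out how the algorithm works, so the following is only a minimal sketch of the seat-grabbing idea the name suggests, not the authors' algorithm. It assumes that each unsettled player picks an arm uniformly at random, observes only whether it collided, and stays on the first arm it holds without a collision. The function name `musical_chairs`, its parameters, and the choice to sample over all arms are hypothetical simplifications.

```python
import random


def musical_chairs(num_players, num_arms, max_rounds=10_000, seed=0):
    """Simulate a communication-free seat-grabbing phase.

    Returns (assignment, rounds_used), where assignment[p] is the arm
    player p settled on. Illustrative sketch only (assumed mechanism).
    """
    if num_players > num_arms:
        raise ValueError("need at least as many arms as players")
    rng = random.Random(seed)
    fixed = [None] * num_players  # arm each player has settled on, if any

    for round_no in range(1, max_rounds + 1):
        # Settled players keep their arm; the rest sample uniformly at random.
        choices = [f if f is not None else rng.randrange(num_arms) for f in fixed]

        # Count how many players chose each arm this round.
        counts = {}
        for arm in choices:
            counts[arm] = counts.get(arm, 0) + 1

        # A player settles only if nobody else chose its arm (no collision).
        for p, arm in enumerate(choices):
            if fixed[p] is None and counts[arm] == 1:
                fixed[p] = arm

        if all(f is not None for f in fixed):
            return fixed, round_no

    raise RuntimeError("did not settle within max_rounds")


if __name__ == "__main__":
    assignment, rounds = musical_chairs(num_players=3, num_arms=5, seed=1)
    print(assignment, rounds)
```

In this sketch each player uses only its own collision feedback, so no communication is needed. Once every player is settled, the assignment is collision-free and stays that way.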

Similar resources

A combinatorial analysis of the average time for open-address hash coding insertion

In analysing a well-known hash-coding method, Knuth gave an exact expression for the average number of rejections encountered by players of a variant of musical chairs. We study a variant more closely related to musical chairs itself and deduce the same expression by a purely combinatorial approach. In an analysis of the average time to insert an item when using open-address hash-coding, Knuth [...

Playing in stochastic environment: from multi-armed bandits to two-player games

Given a zero-sum infinite game, we examine whether players have optimal memoryless deterministic strategies. It turns out that under some general conditions the problem for two-player games can be reduced to the same problem for one-player games, which in turn can be reduced to a simpler related problem for multi-armed bandits. Digital Object Identifier 10.4230/LIPIcs.FSTTCS.2010.65

Musical chairs

In the musical chairs game MC(n,m), a team of n players plays against an adversarial scheduler. The scheduler wins if the game proceeds indefinitely, while termination after a finite number of rounds is declared a win of the team. At each round of the game each player occupies one of the m available chairs. Termination (and a win of the team) is declared as soon as each player occupies a unique...

Online Multi-Armed Bandit

We introduce a novel variant of the multi-armed bandit problem, in which bandits are streamed one at a time to the player, and at each point, the player can either choose to pull the current bandit or move on to the next bandit. Once a player has moved on from a bandit, they may never visit it again, which is a crucial difference between our problem and classic multi-armed bandit problems. In t...

Multi-Player Bandits Models Revisited

Multi-player Multi-Armed Bandits (MAB) have been extensively studied in the literature, motivated by applications to Cognitive Radio systems. Driven by such applications as well, we motivate the introduction of several levels of feedback for multi-player MAB algorithms. Most existing work assumes that sensing information is available to the algorithm. Under this assumption, we improve the state-...

Publication date: 2016